475 research outputs found

    Extracting Word Sequence Correspondences with Support Vector Machines

    We propose a method for extracting word sequence correspondences from non-aligned parallel corpora with Support Vector Machines, which generalize well, rarely overfit the training samples, and can learn dependencies among features by using a kernel function. Our method uses translation-model features based on a translation dictionary, the number of words, part-of-speech, constituent words, and neighboring words. Experiments on Japanese and English parallel corpora achieved a precision of 81.1% and a recall of 69.0% for the extracted word sequence correspondences. This demonstrates that our method could reduce the cost of making translation dictionaries.

    Integer programming for selecting set of informative markers in paternity inference

    BACKGROUND: Parentage information is fundamental to various life sciences. Recent advances in sequencing technologies have made it possible to accurately infer parentage even in non-model species. The optimization of sets of genome-wide markers is valuable for cost-effective applications but requires extremely large amounts of computation, which presses for the development of new efficient algorithms. RESULTS: Here, for a closed half-sib population, we generalized the process of marker loci selection as a binary integer programming problem. The proposed systematic formulation considered marker localization and the family structure of the potential parental population, resulting in an accurate assignment with a small set of markers. We also proposed an efficient heuristic approach, which effectively improved the number of markers, localization, and tolerance to missing data of the set. Applying this method to the actual genotypes of apple (Malus × domestica) germplasm, we identified a set of 34 SNP markers that distinguished 300 potential parents crossed to a particular cultivar with a greater than 99% accuracy. CONCLUSIONS: We present a novel approach for selecting informative markers based on binary integer programming. Since the data generated by high-throughput sequencing technology far exceeds the requirement for parentage assignment, a combination of the systematic marker selection with targeted SNP genotyping, such as KASP, allows flexibly enlarging the analysis up to a scale that has been unrealistic in various species. The method developed in this study can be directly applied to unsolved large-scale problems in breeding, reproduction, and ecological research, and is expected to lead to novel knowledge in various biological fields. The implementation is available at https://github.com/SoNishiyama/IP-SIMPAT
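    The marker-selection idea in this abstract can be illustrated with a small greedy heuristic. This is only a sketch under simplifying assumptions, not the authors' formulation or the IP-SIMPAT implementation: it assumes that a marker "distinguishes" two candidate parents whenever their genotype codes differ, and it treats selection as a covering problem in which every pair of candidates must be distinguished by at least `min_diff` chosen markers. The function name, the toy genotype matrix, and the `min_diff` parameter are all hypothetical.

```python
from itertools import combinations


def greedy_marker_selection(genotypes, min_diff=1):
    """Greedily pick marker indices until every pair of candidates
    differs at >= min_diff selected markers (a set-cover style heuristic).

    genotypes: dict mapping candidate name -> list of genotype codes,
               one code per marker (hypothetical 0/1/2 encoding).
    """
    names = sorted(genotypes)
    n_markers = len(genotypes[names[0]])
    pairs = list(combinations(names, 2))

    # covers[m] = set of candidate pairs that marker m tells apart
    covers = [
        {(a, b) for a, b in pairs if genotypes[a][m] != genotypes[b][m]}
        for m in range(n_markers)
    ]

    need = {p: min_diff for p in pairs}  # differences still required per pair
    selected = []
    while any(v > 0 for v in need.values()):
        best, best_gain = None, 0
        for m in range(n_markers):
            if m in selected:
                continue
            # Gain = number of still-unsatisfied pairs this marker helps
            gain = sum(1 for p in covers[m] if need[p] > 0)
            if gain > best_gain:
                best, best_gain = m, gain
        if best is None:
            raise ValueError("remaining pairs cannot be distinguished")
        selected.append(best)
        for p in covers[best]:
            if need[p] > 0:
                need[p] -= 1
    return selected


# Toy data (assumed): four candidate parents, four markers.
genotypes = {
    "P1": [0, 0, 1, 2],
    "P2": [0, 1, 1, 0],
    "P3": [1, 0, 1, 0],
    "P4": [1, 1, 1, 2],
}
selected = greedy_marker_selection(genotypes)
print(selected)  # → [0, 1]
```

    A greedy pass like this gives no optimality guarantee; an exact binary integer program over the same constraints would need a dedicated solver.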

    Evaluation of Gastric Cancer Invasion Depth by Transabdominal Ultrasonography

    Background: Although endoscopy and endoscopic ultrasonography are generally used to diagnose the depth of gastric tumor invasion, endoscopy is invasive and frequently results in patient discomfort. Transabdominal ultrasonography (TUS) is noninvasive and may be useful in determining this depth. We investigated the usefulness of TUS in determining the depth of tumor invasion in patients with gastric cancer. Methods: This retrospective study included 190 patients with gastric cancer and 200 lesions who underwent curative resection at the Department of Gastrointestinal Surgery of Tottori University Hospital from July 2007 to July 2015. The results of conventional diagnostic imaging and TUS were compared with those of pathological analysis obtained after surgery. Furthermore, the ruptured form of the third layer on TUS imaging was reviewed and investigated to differentiate between the SM2 and MP lesions. Results: The accuracy of TUS was similar to that of conventional diagnostic imaging for all depths of tumor invasion. Eight lesions could not be assessed by TUS, including four that could not be identified and four in which TUS was unable to diagnose the depth. In cases where the ruptured form of the third layer could be determined in MP lesions, the forms were observed toward the inside of the gastric lumen. Conclusion: The results of this study suggested that the accuracy of TUS was equivalent to that of conventional diagnostic imaging in determining the depth of tumor invasion. TUS assessment criteria may be useful to classify this depth. Furthermore, the ruptured form of the third layer is believed to be important in distinguishing between early and advanced gastric cancer

    Improved Measurements of RNA Structure Conservation with Generalized Centroid Estimators

    Identification of non-protein-coding RNAs (ncRNAs) in genomes is a crucial task in both molecular cell biology and bioinformatics. Secondary structures of ncRNAs are a key feature in ncRNA analysis, since the biological functions of ncRNAs are closely related to their secondary structures. Although the minimum free energy (MFE) structure of an RNA sequence is regarded as the most stable structure, MFE alone is not an appropriate measure for identifying ncRNAs, since the free energy is heavily biased by the nucleotide composition. Therefore, instead of MFE itself, several alternative measures for identifying ncRNAs have been proposed, such as the structure conservation index (SCI) and the base pair distance (BPD), both of which employ MFE structures. However, these measures are not suitable for identifying ncRNAs in some cases, including genome-wide searches, and incur a high false discovery rate. In this study, we propose improved measures based on SCI and BPD, applying generalized centroid estimators to provide robustness against low-quality multiple alignments. Our experiments show that the proposed methods achieve higher accuracy than the original SCI and BPD, not only for human-curated structural alignments but also for low-quality alignments produced by CLUSTAL W. Furthermore, the centroid-based SCI on CLUSTAL W alignments is more accurate than or comparable to the original SCI on structural alignments generated with RAF, a high-quality structural aligner that requires about twice the computation time on average. We conclude that our methods are better suited than the original SCI and BPD to genome-wide alignments, which are of low quality with respect to secondary structure.
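    For orientation, the original SCI that this abstract builds on is commonly defined as a ratio of folding energies. The abstract does not spell out the formula, so the notation below ($E_A$, $E_i$, $N$) is an assumption that follows the usual definition; the centroid-based variant proposed in the paper is not reproduced here.

```latex
% E_A : MFE of the consensus structure folded from the whole alignment
% E_i : MFE of the i-th sequence folded on its own
% N   : number of sequences in the alignment
\[
  \mathrm{SCI} \;=\; \frac{E_A}{\bar{E}},
  \qquad
  \bar{E} \;=\; \frac{1}{N}\sum_{i=1}^{N} E_i .
\]
```

    Under this definition, an SCI near 0 indicates no shared structure, while values near or above 1 indicate a well-conserved consensus fold.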

    The Stability of Sustainable Development Path and Institutions: Evidence from Genuine Savings Indicators

    This paper investigates institutional factors affecting the performance of genuine savings (GS), which is often used in assessing sustainable development, adopting an autoregressive conditional heteroscedasticity in mean model. We pay particular attention to the contribution of institutions to decreasing the volatility of the GS path. Using GS data from the World Bank’s World Development Indicators and institutional data from the International Country Risk Guide, the estimation results show that institutions affect GS performance in two ways. First, high-quality institutions enhance the GS level directly. Second, high-quality institutions enhance the GS level by stabilizing the volatility of the GS path. Considering both effects together, institutional improvement plays an important role in realizing a sustainable development path.
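    The two channels described in this abstract can be written as a generic ARCH-in-mean specification. This is a sketch only: the abstract does not give the estimated equations, so the variable $\mathit{INST}_t$ (an institutional quality index), the lag order $q$, and the placement of $\mathit{INST}_t$ in both equations are assumptions made for illustration.

```latex
% Mean equation: institutions enter directly (beta_1) and the
% conditional variance h_t enters the mean (delta), as in ARCH-M.
% Variance equation: ARCH(q) terms plus an assumed institutional term (gamma).
\begin{align*}
  GS_t &= \beta_0 + \beta_1\,\mathit{INST}_t + \delta\,h_t + \varepsilon_t,
  \qquad \varepsilon_t \mid \Omega_{t-1} \sim N(0,\,h_t), \\
  h_t  &= \alpha_0 + \sum_{i=1}^{q} \alpha_i\,\varepsilon_{t-i}^{2}
          + \gamma\,\mathit{INST}_t .
\end{align*}
```

    In this notation the direct effect corresponds to $\beta_1 > 0$, and the indirect effect to $\gamma < 0$ combined with $\delta < 0$: better institutions lower $h_t$, and lower volatility raises the GS level. A linear term in the variance equation does not by itself keep $h_t$ positive, so an estimable version would need a suitable restriction or transformation.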